Images in Language Space: Exploring the Suitability of Large Language Models for Vision & Language Tasks
Large language models have demonstrated robust performance on various
language tasks using zero-shot or few-shot learning paradigms. While being
actively researched, multimodal models that can additionally handle images as
input have yet to catch up in size and generality with language-only models. In
this work, we ask whether language-only models can be utilised for tasks that
require visual input -- but also, as we argue, often require a strong reasoning
component. Similar to some recent related work, we make visual information
accessible to the language model using separate verbalisation models.
Specifically, we investigate the performance of open-source, open-access
language models against GPT-3 on five vision-language tasks when given
textually-encoded visual information. Our results suggest that language models
are effective for solving vision-language tasks even with limited samples. This
approach also enhances the interpretability of a model's output by providing a
means of tracing the output back through the verbalised image content.
Comment: Accepted at ACL 2023 Findings
Combining Textual Features for the Detection of Hateful and Offensive Language
The detection of offensive, hateful and profane language has become a critical challenge, since many users in social networks are exposed to cyberbullying on a daily basis. In this paper, we present an analysis of combining different textual features for the detection of hateful or offensive posts on Twitter. We provide a detailed experimental evaluation to understand the impact of each building block in a neural network architecture. The proposed architecture is evaluated on the English Subtask 1A (Identifying Hate, Offensive and Profane Content) of the HASOC-2021 dataset under the team name TIB-VA. We compared different variants of contextual word embeddings combined with character-level embeddings and an encoding of collected hate terms.
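A minimal sketch of the feature-combination idea described above: heterogeneous textual features are concatenated into a single input vector for a classifier. The feature extractors below (hashed character n-grams, a lexicon flag, a stand-in contextual embedding) are illustrative placeholders, not the paper's actual embeddings or lexicon.

```python
import numpy as np

# Hypothetical placeholder lexicon; the paper uses a collected list of hate terms.
HATE_TERMS = {"badword1", "badword2"}

def char_ngram_counts(text, n=3, dim=8):
    """Hashed character n-gram counts as a fixed-size vector (toy stand-in
    for learned character-level embeddings)."""
    vec = np.zeros(dim)
    for i in range(len(text) - n + 1):
        vec[hash(text[i:i + n]) % dim] += 1
    return vec

def hate_term_feature(text):
    """Single indicator: does the post contain a lexicon term?"""
    tokens = set(text.lower().split())
    return np.array([float(bool(tokens & HATE_TERMS))])

def combine_features(text, contextual_vec):
    """Concatenate contextual embedding, character features, and lexicon flag."""
    return np.concatenate([contextual_vec,
                           char_ngram_counts(text),
                           hate_term_feature(text)])

# A 4-d zero vector stands in for a contextual word embedding of the post.
vec = combine_features("an example post", np.zeros(4))
print(vec.shape)  # (4 + 8 + 1,) = (13,)
```

The combined vector would then feed a downstream classifier; the experimental evaluation in the paper ablates each of these building blocks separately.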
Unveiling Global Narratives: A Multilingual Twitter Dataset of News Media on the Russo-Ukrainian Conflict
The ongoing Russo-Ukrainian conflict has been a subject of intense media
coverage worldwide. Understanding the global narrative surrounding this topic
is crucial for researchers who aim to gain insights into its multifaceted
dimensions. In this paper, we present a novel dataset that focuses on this
topic by collecting and processing tweets posted by news or media companies on
social media across the globe. We collected tweets from February 2022 to May
2023 to acquire approximately 1.5 million tweets in 60 different languages.
Each tweet in the dataset is accompanied by processed tags, allowing for the
identification of entities, stances, concepts, and sentiments expressed. The
availability of the dataset serves as a valuable resource for researchers
aiming to investigate the global narrative surrounding the ongoing conflict
from various aspects such as who are the prominent entities involved, what
stances are taken, where do these stances originate, and how are the different
concepts related to the event portrayed.
Comment: Dataset can be found at https://zenodo.org/record/804345
Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures
Hakimov S. Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures. Bielefeld: Universität Bielefeld; 2019.
The task of answering natural language questions over structured data has received wide
interest in recent years. Structured data in the form of knowledge bases has been available
for public usage with coverage on multiple domains. DBpedia and Freebase are such knowledge
bases that include encyclopedic data about multiple domains. However, querying such
knowledge bases requires an understanding of a query language and the underlying ontology,
which requires domain expertise. Querying structured data via question answering systems
that understand natural language has gained popularity to bridge the gap between the data
and the end user.
In order to understand a natural language question, a question answering system needs
to map the question into a query representation that can be evaluated against a knowledge base.
An important aspect we focus on in this thesis is multilinguality. While most research has
focused on building monolingual solutions, mainly for English, this thesis focuses on building
multilingual question answering systems. The main challenge for processing language input
is interpreting the meaning of questions in multiple languages.
In this thesis, we present three different semantic parsing approaches that learn models
to map questions into meaning representations, specifically into queries, in a supervised
fashion. Each approach differs in the way the model is learned, the features of the model, the
way of representing the meaning and how the meaning of questions is composed. The first
approach learns a joint probabilistic model for syntax and semantics simultaneously from the
labeled data. The second method learns a factorized probabilistic graphical model that builds
on a dependency parse of the input question and predicts the meaning representation that is
converted into a query. The last approach presents a number of different neural architectures
that tackle the task of question answering in end-to-end fashion. We evaluate each approach
using publicly available datasets and compare them with state-of-the-art QA systems.
Named Entity Recognition and Disambiguation using Linked Data and Graph-based Centrality Scoring
Hakimov S, Oto SA, Dogdu E. Named Entity Recognition and Disambiguation using Linked Data and Graph-based Centrality Scoring. In: SIGMOD, SWIM 2012. 2012: 4.
Named Entity Recognition (NER) is a subtask of information extraction and aims to identify atomic entities in text that fall into predefined categories such as person, location, organization, etc. Recent efforts in NER try to extract entities and link them to linked data entities. Linked data is a term used for data resources that are created using semantic web standards, such as DBpedia. There are a number of online tools that try to identify named entities in text and link them to linked data resources. Although one can use these tools via their APIs and web interfaces, they use different data resources and different techniques to identify named entities, and not all of them reveal this information. One of the major tasks in NER is disambiguation, that is, identifying the right entity among a number of entities with the same name; for example, "apple" standing for both "Apple, Inc." the company and the fruit. We developed a similar tool called NERSO, short for Named Entity Recognition Using Semantic Open Data, to automatically extract named entities, disambiguate them, and link them to DBpedia entities. Our disambiguation method is based on constructing a graph of linked data entities and scoring them using a graph-based centrality algorithm. We evaluate our system by comparing its performance with two publicly available NER tools. The results show that NERSO performs better.
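The graph-based disambiguation idea can be sketched as follows: candidate entities for each surface form become graph nodes, edges link candidates that are connected in the knowledge base, and the candidate with the highest centrality wins. This is an illustrative sketch using degree centrality on a toy graph, not NERSO's actual implementation; the entity identifiers are hypothetical DBpedia-style labels.

```python
def degree_centrality(graph):
    """Degree centrality: fraction of other nodes each node is linked to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}

def disambiguate(candidates, graph):
    """Pick, per surface form, the candidate entity with the highest centrality."""
    scores = degree_centrality(graph)
    return {surface: max(cands, key=lambda c: scores.get(c, 0.0))
            for surface, cands in candidates.items()}

# Toy example: "apple" is ambiguous between the company and the fruit; in a
# context mentioning Steve Jobs and the iPhone, the company is better connected.
graph = {
    "dbr:Apple_Inc.": {"dbr:Steve_Jobs", "dbr:IPhone"},
    "dbr:Apple": set(),                      # the fruit: no links in this context
    "dbr:Steve_Jobs": {"dbr:Apple_Inc."},
    "dbr:IPhone": {"dbr:Apple_Inc."},
}
candidates = {"apple": ["dbr:Apple_Inc.", "dbr:Apple"],
              "jobs": ["dbr:Steve_Jobs"]}
print(disambiguate(candidates, graph))
```

The paper scores a graph of linked data entities; any centrality measure (degree, PageRank, betweenness) fits the same skeleton.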
Classification of important segments in educational videos using multimodal features
Videos are a commonly-used type of content in learning during Web search. Many e-learning platforms provide quality content, but sometimes educational videos are long and cover many topics. Humans are good at extracting important sections from videos, but this remains a significant challenge for computers. In this paper, we address the problem of assigning importance scores to video segments, that is, how much information they contain with respect to the overall topic of an educational video. We present an annotation tool and a new dataset of annotated educational videos collected from popular online learning platforms. Moreover, we propose a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features. Our experiments investigate the impact of visual and temporal information, as well as the combination of multimodal features, on importance prediction.
TIB's visual analytics group at MediaEval '20: Detecting fake news on corona virus and 5G conspiracy
Fake news on social media has become a hot topic of research as it negatively impacts the discourse of real news in public. Specifically, the ongoing COVID-19 pandemic has seen a rise of inaccurate and misleading information due to the surrounding controversies and unknown details at the beginning of the pandemic. The FakeNews task at MediaEval 2020 tackles this problem by creating a challenge to automatically detect tweets containing misinformation based on text and on the structure of the Twitter follower network. In this paper, we present a simple approach that uses BERT embeddings and a shallow neural network for classifying tweets using only text, and discuss our findings and the limitations of the approach in text-based misinformation detection.
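A minimal sketch of the "shallow neural network over BERT embeddings" setup: one hidden layer on top of precomputed sentence embeddings, with a sigmoid output for the misinformation class. The dimensions, initialisation, and random stand-in embeddings are assumptions for illustration, not the configuration reported in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(dim=768, hidden=64):
    """Weights for one hidden layer on top of a fixed sentence embedding."""
    return {"W1": rng.normal(0.0, 0.02, (dim, hidden)), "b1": np.zeros(hidden),
            "W2": rng.normal(0.0, 0.02, (hidden, 1)), "b2": np.zeros(1)}

def predict_proba(params, X):
    """ReLU hidden layer, sigmoid output: P(tweet contains misinformation)."""
    h = np.maximum(0.0, X @ params["W1"] + params["b1"])
    logits = h @ params["W2"] + params["b2"]
    return (1.0 / (1.0 + np.exp(-logits))).ravel()

params = init_params()
embeddings = rng.normal(size=(3, 768))   # stand-in for 768-d BERT tweet embeddings
print(predict_proba(params, embeddings).shape)  # one probability per tweet
```

Training would fit the two weight matrices with binary cross-entropy; only the shallow head is learned, the BERT encoder stays frozen as a feature extractor.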
Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features
In this digital age of news consumption, a news reader has the ability to react, express and share opinions with others in a highly interactive and fast manner. As a consequence, fake news has made its way into our daily lives because of the very limited capacity of large companies as well as individuals to verify news on the Internet. In this paper, we focus on solving two problems which are part of the fact-checking ecosystem and can help to automate fact-checking of claims in an ever-increasing stream of content on social media. For the first problem, claim check-worthiness prediction, we explore the fusion of syntactic features and deep transformer Bidirectional Encoder Representations from Transformers (BERT) embeddings to classify the check-worthiness of a tweet, i.e. whether it includes a claim or not. We conduct a detailed feature analysis and present our best performing models for English and Arabic tweets. For the second problem, claim retrieval, we explore pre-trained embeddings from a Siamese network transformer model (sentence-transformers) specifically trained for semantic textual similarity, and perform KD-search to retrieve verified claims with respect to a query tweet.
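The claim-retrieval step can be sketched as nearest-neighbour search over embedding vectors. The paper uses sentence-transformers embeddings with KD-search; this illustration substitutes toy 3-d vectors and a brute-force cosine-similarity scan, which returns the same neighbours a KD-tree would on these inputs.

```python
import numpy as np

def normalize(v):
    """L2-normalise vectors so the dot product equals cosine similarity."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def retrieve(query_vec, claim_vecs, k=1):
    """Return indices of the k verified claims most similar to the query tweet."""
    sims = normalize(claim_vecs) @ normalize(query_vec)
    return np.argsort(-sims)[:k]

# Toy stand-ins for embeddings of three verified claims and one query tweet.
claims = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.9, 0.1, 0.0]])
query = np.array([1.0, 0.05, 0.0])
print(retrieve(query, claims, k=2))
```

In practice the claim and query vectors would come from the same pre-trained sentence-transformers model, and a tree or approximate index replaces the brute-force scan for large claim collections.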